Abstract

This project investigates racial gerrymandering in Alabama’s 2021 and 2023 congressional districts using area-weighted reaggregation of census block group data to convex hull and minimum bounding circle geometries. Compactness scores for each district are calculated and compared.

Study metadata

Study design

This is an original study based on literature on gerrymandering metrics.

It is an exploratory study to evaluate the usefulness of new gerrymandering metrics that compare representativeness in each district's convex hull and minimum bounding circle with representativeness in the congressional district itself.

Materials and procedure

Computational environment

I plan on using package … for …

Data and variables

We plan on using the data sources precincts20, districts21, and districts23 from the districts geopackage (districts.gpkg) and blockgroups2020 from the 2020 Decennial Census.

## Driver: GPKG 
## Available layers:
##    layer_name geometry_type features fields crs_name
## 1 districts21 Multi Polygon        7      4   WGS 84
## 2 districts23 Multi Polygon        7      4    NAD83
## 3 precincts20 Multi Polygon     1972      8    NAD83

Precincts 2020

  • Title: Voting Precincts 2020
  • Abstract: Alabama 2020 voting precincts with election data (votes for Biden and Trump)
  • Spatial Coverage: Alabama
  • Spatial Resolution: voting precincts
  • Spatial Reference System: EPSG:4269 NAD 1983 geographic coordinate system
  • Temporal Coverage: voting precincts used for tabulating the 2020 election
  • Temporal Resolution: annual election
  • Lineage: Saved as geopackage format. Processing prior to download is explained in al_vest_20_validation_report.pdf
  • Distribution: Data available at Redistricting Data Hub with free login
  • Constraints: Permitted for noncommercial and nonpartisan use only. Copyright and use constraints explained in redistrictinghub_legal.txt
  • Data Quality: n/a
  • Variables: For each variable, enter the following information. If you have two or more variables per data source, you may want to present this information in table form (shown below)
    • Label: variable name as used in the data or code
    • Alias: intuitive natural language name
    • Definition: Short description or definition of the variable. Include measurement units in description.
    • Type: data type, e.g. character string, integer, real
    • Accuracy: e.g. uncertainty of measurements
    • Domain: Expected range of Maximum and Minimum of numerical data, or codes or categories of nominal data, or reference to a standard codebook
    • Missing Data Value(s): Values used to represent missing data and frequency of missing data observations
    • Missing Data Frequency: Frequency of missing data observations: not yet known for data to be collected
Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency
VTDST20 | VDID | voting district ID | | | | |
GEOID20 | GEOID | unique geographic ID | | | | |
G20PRETRU | trump | total votes for Trump in 2020 | | | | |
G20PREBID | biden | total votes for Biden in 2020 | | | | |

Districts 2023

## Reading layer `districts23' from data source 
##   `/Users/m/Desktop/School/Middlebury/Senior/spring25/Open/github/OR-Gerrymander-Alabama/data/raw/public/alabama/districts.gpkg' 
##   using driver `GPKG'
## Simple feature collection with 7 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -88.47323 ymin: 30.14443 xmax: -84.88825 ymax: 35.00803
## Geodetic CRS:  NAD83

Map the districts

Blockgroups 2020

  • Title: blockgroups2020
  • Abstract: vector polygon geopackage layer of census block groups and demographic data
  • Spatial Coverage: Alabama
  • Spatial Resolution: Census block groups
  • Spatial Reference System: EPSG:4269 NAD 1983 geographic coordinate system
  • Temporal Coverage: 2020 census
  • Temporal Resolution: n/a
  • Lineage: taken from U.S. Census API “pl” public law summary file using tidycensus in R
  • Distribution: U.S. Census API
  • Constraints: Public Domain data free for use and redistribution
  • Data Quality: n/a
  • Variables: For each variable, enter the following information. If you have two or more variables per data source, you may want to present this information in table form (shown below)
    • Label: variable name as used in the data or code
    • Alias: intuitive natural language name
    • Definition: Short description or definition of the variable. Include measurement units in description.
    • Type: data type, e.g. character string, integer, real
    • Accuracy: e.g. uncertainty of measurements
    • Domain: Range (Maximum and Minimum) of numerical data, or codes or categories of nominal data, or reference to a standard codebook
    • Missing Data Value(s): Values used to represent missing data and frequency of missing data observations
    • Missing Data Frequency: Frequency of missing data observations
Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency
GEOID | | code to uniquely identify block groups | | | | |
P4_001N | | Total Population, 18 years or older | | | | |
P4_006N | | Total: Not Hispanic or Latino, Population of one race, Black or African American alone, 18 years or older | | | | |
P5_003N | | Total institutionalized population in correctional facilities for adults, 18 years or older | | | | |

Acquire census block group data using the tidycensus package. First, query metadata for the pl public law data series.

In the 2023 court cases on Alabama’s redistricting, it was argued that Alabama had a racial gerrymander discriminating against Black and African American voters. Therefore, we will analyze the voting-age population, divided into Black and non-Black voters. These data are found in table P3.

Query the public law data series table P3 on “race for the population 18 years and over”.

## Reading layer `block_groups' from data source 
##   `/Users/m/Desktop/School/Middlebury/Senior/spring25/Open/github/OR-Gerrymander-Alabama/data/raw/public/block_groups.gpkg' 
##   using driver `GPKG'
## Simple feature collection with 3925 features and 83 fields (with 1 geometry empty)
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -88.47323 ymin: 30.22333 xmax: -84.88908 ymax: 35.00803
## Geodetic CRS:  NAD83

Prior observations

Previously, racial and voting data were analyzed for 2016 and 2020 voting districts. Compactness was assessed using the ratio of area to perimeter squared, which measures how closely each district's shape resembles a circle.
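
As a sketch of the area-to-perimeter-squared measure described above: scaling the ratio by 4π (the Polsby-Popper convention, assumed here because the normalization is not stated) makes a perfect circle score 1. The Python function names and toy polygons below are hypothetical and are not the study's own code, which was written in R.

```python
import math

def polygon_area_perimeter(coords):
    """Return (area, perimeter) of a simple polygon given as (x, y) vertices.

    Coordinates are assumed to be planar (projected), not longitude/latitude.
    """
    twice_area = 0.0   # shoelace formula accumulates twice the signed area
    perimeter = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]   # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter

def polsby_popper(coords):
    """4 * pi * area / perimeter**2: 1 for a circle, near 0 for contorted shapes."""
    area, perimeter = polygon_area_perimeter(coords)
    return 4.0 * math.pi * area / perimeter ** 2

# Toy shapes (hypothetical): a unit square and a long thin sliver of equal area.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sliver = [(0, 0), (10, 0), (10, 0.1), (0, 0.1)]
print(round(polsby_popper(square), 3))
print(round(polsby_popper(sliver), 3))
```

The two toy shapes have the same area, so the much lower score of the sliver comes entirely from its longer perimeter.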

Prior to analysis, only metadata have been observed for each data source.

Bias and threats to validity

This study is an exploration of the modifiable areal unit problem (MAUP). Compactness is arbitrarily defined, and fairness (the presence of a racial or political gerrymander) is also subjective. The bounding circle and convex hull methods are an attempt to apply different metrics and analyze the results to form a more robust understanding of compactness and fairness in Alabama voting districts. However, geographic features such as natural boundaries (coastlines, rivers, mountain ranges, etc.) and urban settlement patterns complicate the use of universal metrics like the ones explored in this study.

Data transformations

Districts 2023

Reproject CRS to EPSG:4269 and calculate Black population.

Calculate geometry area.

Blockgroups 2020

Reproject CRS to EPSG:4269.

Extract all P3 reporting categories with people who identify as Black.

X name label
151 P3_004N !!Total:!!Population of one race:!!Black or African American alone
158 P3_011N !!Total:!!Population of two or more races:!!Population of two races:!!White; Black or African American
163 P3_016N !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; American Indian and Alaska Native
164 P3_017N !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Asian
165 P3_018N !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Native Hawaiian and Other Pacific Islander
166 P3_019N !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Some Other Race
174 P3_027N !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; American Indian and Alaska Native
175 P3_028N !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Asian
176 P3_029N !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Native Hawaiian and Other Pacific Islander
177 P3_030N !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Some Other Race
184 P3_037N !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Asian
185 P3_038N !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander
186 P3_039N !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Some Other Race
187 P3_040N !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Asian; Native Hawaiian and Other Pacific Islander
188 P3_041N !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Asian; Some Other Race
189 P3_042N !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Native Hawaiian and Other Pacific Islander; Some Other Race
195 P3_048N !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Asian
196 P3_049N !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander
197 P3_050N !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Some Other Race
198 P3_051N !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Asian; Native Hawaiian and Other Pacific Islander
199 P3_052N !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Asian; Some Other Race
200 P3_053N !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Native Hawaiian and Other Pacific Islander; Some Other Race
205 P3_058N !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander
206 P3_059N !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Asian; Some Other Race
207 P3_060N !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander; Some Other Race
208 P3_061N !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race
211 P3_064N !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander
212 P3_065N !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Asian; Some Other Race
213 P3_066N !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander; Some Other Race
214 P3_067N !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race
216 P3_069N !!Total:!!Population of two or more races:!!Population of five races:!!Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race
218 P3_071N !!Total:!!Population of two or more races:!!Population of six races:!!White; Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race

Next, sum all of the categories of populations who identify as Black and calculate the following fields:

  • bgarea: the geographic area of each block group
  • Total: the total population 18 years or older
  • PctBlack: the percentage of Total that is Black
  • CheckPct: the calculated Black population plus the white population, divided by Total and expressed as a percentage; this represents the share of the population considered in this analysis

CheckPct should be close to, but slightly below, 100 in Alabama.
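
The derived fields above can be sketched for a single block group as follows. This is an illustrative Python version, not the study's R code: the function name, the toy counts, and the use of P3_001N and P3_003N as the total and white-alone fields are assumptions, and only two of the Black reporting categories are included to keep the example short.

```python
def derive_fields(counts, black_fields, total_field="P3_001N", white_field="P3_003N"):
    """Compute Black, Total, PctBlack and CheckPct for one block group.

    counts maps census variable names to adult population counts.
    """
    black = sum(counts[f] for f in black_fields)   # all categories including Black
    total = counts[total_field]
    if total == 0:
        # Unpopulated block groups have no defined percentage.
        return {"Black": black, "Total": 0, "PctBlack": None, "CheckPct": None}
    return {
        "Black": black,
        "Total": total,
        "PctBlack": 100 * black / total,
        "CheckPct": 100 * (black + counts[white_field]) / total,
    }

# Hypothetical block group with 1000 adults.
toy_counts = {"P3_001N": 1000, "P3_003N": 600, "P3_004N": 300, "P3_011N": 20}
toy_black_fields = ["P3_004N", "P3_011N"]   # shortened list for illustration
result = derive_fields(toy_counts, toy_black_fields)
print(result)
```

In this toy case CheckPct falls below 100 because 80 adults are neither white alone nor in any category that includes Black, which is the kind of shortfall the check is meant to reveal.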

Save the results.

## Deleting layer `blockgroups_calc' using driver `GPKG'
## Writing layer `blockgroups_calc' to data source 
##   `/Users/m/Desktop/School/Middlebury/Senior/spring25/Open/github/OR-Gerrymander-Alabama/data/derived/public/blockgroups_calc.gpkg' using driver `GPKG'
## Writing 3925 features with 6 fields and geometry type Multi Polygon.

And plot results to visualize.

Now, we visualize the 2023 districts overlaid on blockgroups by pctBlack

Analysis

Use area-weighted reaggregation (AWR) to estimate the voting-age Black and total populations in each district from block group data, using st_intersection() to overlay the two layers. Next, AWR will be repeated on the convex hull and minimum bounding circle geometries. Compactness scores will be calculated for each analysis geometry.
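
To illustrate the idea behind AWR, the sketch below uses axis-aligned rectangles in place of real polygons so that the overlap can be computed without a GIS library. Each source unit contributes its population multiplied by the fraction of its area that falls inside the target geometry. The function names, field names, and toy values are hypothetical, and this Python version is not the study's R code.

```python
def rect_area(rect):
    """Area of (xmin, ymin, xmax, ymax); zero if the rectangle is empty or inverted."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)

def rect_intersection(a, b):
    """Overlap of two rectangles (may be inverted, meaning no overlap)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def area_weighted_sum(target, sources, field):
    """Estimate the total of `field` inside `target` from overlapping source units."""
    total = 0.0
    for src in sources:
        full = rect_area(src["bounds"])
        if full == 0:
            continue   # skip empty geometries
        overlap = rect_area(rect_intersection(target, src["bounds"]))
        total += src[field] * overlap / full   # assumes uniform density in the unit
    return total

# Hypothetical district and block groups.
district = (0, 0, 3, 2)
block_groups = [
    {"bounds": (0, 0, 2, 2), "Black": 100, "Total": 400},   # fully inside
    {"bounds": (2, 0, 4, 2), "Black": 60, "Total": 200},    # half inside
    {"bounds": (5, 5, 6, 6), "Black": 50, "Total": 90},     # outside
]
bg_black = area_weighted_sum(district, block_groups, "Black")
bg_total = area_weighted_sum(district, block_groups, "Total")
print(bg_black, bg_total, round(100 * bg_black / bg_total, 1))
```

The uniform-density assumption in the weighting step is the same one flagged by the "attribute variables are assumed to be spatially constant" warning in the output below.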

Research is exploratory so there are no statistical tests or weighting criteria.

2023 districts

Estimate voting populations by race for each 2023 district

## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries

Report results

DISTRICT  POPULATION  WHITE   BLACK   pctBlack  area (m^2)   bgBlack    bgTotal   pctBlackbg
1         717754      527330  116462  16.2      21047105198  90288.76   556559.0  16.2
2         717754      303461  353228  49.2      24768106013  272078.98  559640.2  48.6
3         717754      508080  146376  20.4      17801383934  116778.64  564183.0  20.7
4         717754      585183  49721   6.9       22936713017  42522.09   558441.9  7.6
5         717754      495427  126226  17.6      10534064156  102750.67  560573.5  18.3
6         717755      517634  125785  17.5      11802712695  96865.75   552556.9  17.5
7         717754      283337  378364  52.7      26953169812  293014.56  564941.7  51.9
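
As a quick consistency check on the table, the sketch below recomputes both percentage columns from the counts. The formulas (pctBlack as BLACK over POPULATION, pctBlackbg as bgBlack over bgTotal, each times 100 and rounded to one decimal) are inferred from the reported values rather than taken from the study's code, and the Python names are hypothetical.

```python
# Values copied from the results table, one tuple per district:
# (district, POPULATION, BLACK, pctBlack, bgBlack, bgTotal, pctBlackbg)
rows = [
    (1, 717754, 116462, 16.2, 90288.76, 556559.0, 16.2),
    (2, 717754, 353228, 49.2, 272078.98, 559640.2, 48.6),
    (3, 717754, 146376, 20.4, 116778.64, 564183.0, 20.7),
    (4, 717754, 49721, 6.9, 42522.09, 558441.9, 7.6),
    (5, 717754, 126226, 17.6, 102750.67, 560573.5, 18.3),
    (6, 717755, 125785, 17.5, 96865.75, 552556.9, 17.5),
    (7, 717754, 378364, 52.7, 293014.56, 564941.7, 51.9),
]

def pct(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

for district, pop, black, pct_black, bg_black, bg_total, pct_black_bg in rows:
    matches = pct(black, pop) == pct_black and pct(bg_black, bg_total) == pct_black_bg
    print(district, pct(black, pop), pct(bg_black, bg_total), "ok" if matches else "mismatch")
```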

Estimate voting populations by race using convex hull and join to blockgroup estimates

## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries

Estimate voting populations by race using minimum bounding circle and join to blockgroup estimates

Calculate compactness scores for each geometry
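
The exact definitions of the hull and circle scores are not stated in this report. One common choice, assumed in the sketch below, is the ratio of the district's area to the area of its convex hull and to the area of its minimum bounding circle (the latter is often called the Reock score). The Python code is illustrative only: all function names are hypothetical, and the brute-force circle search is practical only for small toy polygons, not real district boundaries.

```python
import math
from itertools import combinations

def shoelace_area(coords):
    """Unsigned area of a simple polygon given as ordered (x, y) vertices."""
    n = len(coords)
    s = sum(coords[i][0] * coords[(i + 1) % n][1] - coords[(i + 1) % n][0] * coords[i][1]
            for i in range(n))
    return abs(s) / 2.0

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()   # drop points that make a clockwise or straight turn
            chain.append(p)
        return chain[:-1]
    return half(pts) + half(reversed(pts))

def circle_from_two(a, b):
    """Circle with segment ab as its diameter, as (cx, cy, r)."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)

def circle_from_three(a, b, c):
    """Circumcircle of a triangle, or None if the points are collinear."""
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if d == 0:
        return None
    sa, sb, sc = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (sa * (b[1] - c[1]) + sb * (c[1] - a[1]) + sc * (a[1] - b[1])) / d
    uy = (sa * (c[0] - b[0]) + sb * (a[0] - c[0]) + sc * (b[0] - a[0])) / d
    return (ux, uy, math.dist((ux, uy), a))

def min_bounding_circle(points):
    """Smallest enclosing circle by testing every pair and triple of hull vertices."""
    hull = convex_hull(points)
    candidates = [circle_from_two(a, b) for a, b in combinations(hull, 2)]
    candidates += [c for t in combinations(hull, 3) if (c := circle_from_three(*t))]
    def covers(c):
        return all(math.dist((c[0], c[1]), p) <= c[2] + 1e-9 for p in hull)
    return min((c for c in candidates if covers(c)), key=lambda c: c[2])

def hull_compactness(coords):
    return shoelace_area(coords) / shoelace_area(convex_hull(coords))

def circle_compactness(coords):
    radius = min_bounding_circle(coords)[2]
    return shoelace_area(coords) / (math.pi * radius ** 2)

# Hypothetical L-shaped "district".
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(round(hull_compactness(l_shape), 3), round(circle_compactness(l_shape), 3))
```

Both ratios lie between 0 and 1, with higher values indicating a shape that fills more of its enclosing geometry.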

Results

Correlation matrix and small plots by gerrymandering indicators

              pctBlackbg     diffPct  absdiffPct compact_shp  compact_ch compact_mbc
pctBlackbg     1.0000000   0.8656916   0.2755931  -0.2814174  -0.1247704   0.0389629
diffPct        0.8656916   1.0000000  -0.0363323   0.1128061   0.1850875   0.1185339
absdiffPct     0.2755931  -0.0363323   1.0000000  -0.6801495  -0.5295192   0.2456776
compact_shp   -0.2814174   0.1128061  -0.6801495   1.0000000   0.9557498   0.4197650
compact_ch    -0.1247704   0.1850875  -0.5295192   0.9557498   1.0000000   0.5039445
compact_mbc    0.0389629   0.1185339   0.2456776   0.4197650   0.5039445   1.0000000
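
Each entry in a matrix like this is a pairwise correlation between two per-district indicators. The sketch below assumes Pearson's r (the report does not name the coefficient) and uses made-up toy values, not the study's data. The Python function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)   # undefined (division by zero) for a constant column

def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Toy indicators (hypothetical values for four districts).
toy = {
    "compact": [0.1, 0.2, 0.3, 0.4],
    "absdiff": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(toy)
print(round(matrix["compact"]["absdiff"], 3))
```

With only seven districts in the actual analysis, each coefficient rests on seven points, so individual values are sensitive to any single district.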

Plot difference in representation (between district and convex hull) against compactness scores

Absolute difference in representation is negatively correlated with compactness score for both the shapefile geometry (r = -0.68) and convex hull (r = -0.53) calculations. Points also deviate relatively little from the regression line, suggesting that the trend is consistent across districts. The bounding circle results, however, were less consistent: they showed a weak positive correlation with absolute difference (r = 0.25). Under this method, the compactness score for district 5 decreased while scores for districts 4, 6, and 7 increased. District 5 sits along a flat northern boundary. Because it does not extend north, its score decreases despite the relative lack of obvious gerrymandering. For districts 4, 6, and 7, obvious deviations in their geometries are hidden by the bounding circle method because they extend in several different directions, which makes their overall shape more circular.

Discussion

The negative correlation between compactness score and absolute difference in representation indicates a gerrymander: the districts that were least compact were also the least representative. In a world without gerrymandering, differences in compactness would correspond with natural boundaries and thus would not be associated with less representation. The bounding circle method yielded the least compelling results, as it was unable to detect clear intentional gerrymandering in district 7 and possibly districts 4 and 6 as well. Ultimately, we see that racial gerrymandering remains in 2023 despite redistricting.

Integrity Statement

The authors of this preregistration state that they completed this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and research.

Acknowledgements

This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI: 10.17605/OSF.IO/W29MQ

References